Abstract



Test data generated according to two different multidimensional item
response theory models were compared at both the item response level and the
test score level to determine if measurable differences between the models could
be detected when the data sets were constrained to be equivalent in terms of item
p-values. Although differences could be detected at the item level, these
differences decreased as the correlation between examinee abilities increased.
Furthermore, these item differences were small in magnitude and could be
considered unimportant or insignificant from a practical standpoint. No
differences were found at the total test score level, and it was concluded that, at
least for the data used in this study, the models were indistinguishable.



Comparison of Two Logistic Multidimensional Item 
Response Theory Models 



Psychometricians who have some interest in multidimensional item
response theory (MIRT) modeling may be familiar with the terms compensatory
and noncompensatory as they relate to two general model classification schemes.
Ansley and Forsyth (1985) contrasted the two types of model classifications as
follows: "Compensatory models, unlike noncompensatory models, permit high
ability on one dimension to compensate for low ability on another dimension in
terms of probability of correct response. In the noncompensatory models, the
minimum factor (probability) in the denominator is the upper bound for the
probability of a correct response. Thus, for a two-dimensional item, a person with
a very low ability on one dimension and very high ability on the other has a very
low probability of correctly answering the item" (p. 40).

Typically, MIRT models of the compensatory type, such as the logistic 
MIRT model (Doody-Bogan & Yen, 1983; Hattie, 1981; Reckase, 1985, 1986) or
the normal ogive MIRT model (Samejima, 1974) imply linear combinations of the 
multidimensional abilities in the exponent of the expression for the probability of 
a correct response. In this linear fashion, a low ability on one or more of the k 
ability dimensions can be compensated by a higher ability on one or more of the 
remaining dimensions. Because the compensation is a characteristic of this linear 
combination, such models are probably more accurately labeled linear MIRT 
models. A typical linear logistic MIRT model of the compensatory type can be 
written as 



$$P_j(\theta_i) = c_j + (1 - c_j)\,\frac{\exp\left(\sum_{m=1}^{k} a_{jm}\theta_{im} + d_j\right)}{1 + \exp\left(\sum_{m=1}^{k} a_{jm}\theta_{im} + d_j\right)} \qquad (1)$$

where

$c_j$ = the pseudo-guessing parameter of the jth item,
$a_{jm}$ = the discrimination parameter for the jth item on
the mth dimension,
$d_j$ = the difficulty parameter for the jth item, and
$\theta_{im}$ = the mth element in the ith person's ability vector.

In this model the favorable response probability, $P_j(\theta_i)$, is bounded from
below by $c_j$. However, because the upper bound of $P_j(\theta_i)$ is not a function of any
one ability dimension, it increases monotonically as $\sum_{m=1}^{k} a_{jm}\theta_{im}$ increases.

On the other hand, noncompensatory MIRT models (Sympson, 1978;
Embretson, 1984) describe the probability of a favorable response in terms of a
product of k functions of ability on a single dimension and item characteristics. In
its most common form, a logistic MIRT model of this noncompensatory or
multiplicative type can be written as

$$P_j(\theta_i) = c_j + (1 - c_j) \prod_{m=1}^{k} \frac{e^{f_{ijm}}}{1 + e^{f_{ijm}}} \qquad (2)$$

where now we let $f_{ijm} = a_{jm}(\theta_{im} - b_{jm})$, with $b_{jm}$ = the difficulty parameter for the
jth item on the mth dimension. $P_j(\theta_i)$ is bounded by an upper asymptote equal to
the minimum of $\exp\{f_{ijm}\}/(1 + \exp\{f_{ijm}\})$ and by the lower asymptote, $c_j$, for any given
examinee with $\theta = \theta_i$. Thus, the noncompensatory nature of the model is due to
the fact that $P_j(\theta_i)$ can never be greater than the minimum value of the terms in
the product, $\exp\{f_{ijm}\}/(1 + \exp\{f_{ijm}\})$, a function of the smallest value of the k
ability dimensions for a given examinee. Because of its multiplicative form, the
model is more generally labeled as a multiplicative MIRT model.
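The contrast between the two model types can be illustrated numerically. The following is a minimal sketch in Python, assuming the two-dimensional logistic forms described above (a logistic function of a weighted sum of abilities plus d for the linear model, and a product of one logistic term per dimension for the multiplicative model). The function names and the item parameters are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_compensatory(theta, a, d, c=0.0):
    """Linear (compensatory) model: one logistic of a weighted sum of abilities."""
    z = sum(a_m * t_m for a_m, t_m in zip(a, theta)) + d
    return c + (1.0 - c) * logistic(z)


def p_noncompensatory(theta, a, b, c=0.0):
    """Multiplicative (noncompensatory) model: product of per-dimension logistics."""
    product = 1.0
    for a_m, b_m, t_m in zip(a, b, theta):
        product *= logistic(a_m * (t_m - b_m))
    return c + (1.0 - c) * product


# Hypothetical item (illustrative values, not taken from the tables).
a = (1.0, 1.0)
d = 0.0
b = (0.0, 0.0)

# An examinee who is very low on dimension 1 and very high on dimension 2.
theta = (-3.0, 3.0)

p_c = p_compensatory(theta, a, d)
p_nc = p_noncompensatory(theta, a, b)
smallest_factor = min(
    logistic(a_m * (t_m - b_m)) for a_m, b_m, t_m in zip(a, b, theta)
)

print(f"compensatory:    {p_c:.3f}")
print(f"noncompensatory: {p_nc:.3f} (smallest factor {smallest_factor:.3f})")
```

Under the linear model the high second ability offsets the low first ability, whereas under the multiplicative model the probability cannot exceed the smallest factor in the product.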

Researchers have used the multiplicative MIRT model to examine 
characteristics of unidimensional item response theory parameter estimates 
derived from MIRT response data (Ansley & Forsyth, 1985) and to model certain 
multicomponent latent traits in response processes (Embretson, 1984). Reckase 
(1985) has used a linear MIRT model on real response data to estimate two- 
dimensional item and person parameters on an ACT Assessment Mathematics 
Usage test. However, no one has actually shown that one model is more 
representative of the actual item-examinee response process than the other. It 
may even be possible that one model may be appropriate under one set of 
circumstances while the other type may be more appropriate in other situations. 

In this paper we investigate the differences between item responses
generated by these two logistic MIRT models. We have been interested in
determining whether or not it is possible to distinguish one model or process from
the other through some evaluation of response data. More specifically, our
concern has been in establishing whether or not it is possible to detect differences
between these two MIRT models, either at the item response or test score level,
when the item parameters from each MIRT model have been matched or equated
in some sense.

The first task was to establish the item parameters from one of the logistic
MIRT models that would produce "reasonable" p-values or proportion-correct
indices for a specified examinee population. Therefore, a target distribution of
p-values for a 20-item test was conceived and item parameters for a linear or
compensatory MIRT model were chosen, basically by trial and error, until the
expected p-value with respect to this examinee population matched the target
distribution. Table 1 gives the set of item parameters for the 20 items for the
model given by equation (1). The table also gives the expected value of each
p-value under the assumption that the ability vector, θ, for the examinee population
was distributed as bivariate normal with mean vector 0 and a variance-covariance
matrix with ones along the diagonal and off-diagonal values equal to rho
(.00, .25, .50, or .75). All c-parameters were set to zero.

Insert Table 1 Here 



In order to produce a comparable or "matched" set of noncompensatory, or
multiplicative model item parameters, estimates of these item parameters were
obtained by minimizing

$$\sum_{i=1}^{N} \left[ P_C(\theta_i; a, d) - P_{NC}(\theta_i; a, b) \right]^2 \qquad (3)$$

for N = 2000 randomly selected examinees with ability, θ, distributed as given
previously, where $P_C$ and $P_{NC}$ represent the logistic MIRT models given by equations
(1) and (2), respectively. This process was repeated for 10 replications for each of
the 20 items to ensure that the estimates obtained were not unduly
influenced by the samples selected or the starting values used. Mean values of
the replication estimates yielded the noncompensatory item parameters listed in
Tables 2-5, for rho values of .00, .25, .50, and .75. The expected value of each
item's p-value is given in the last column of each table. Because the least squares
minimization procedure produces unbiased estimates of $P_{NC}$, the expected value of
each p-value under the noncompensatory model should be equal to that of the
compensatory model, within some estimation error. Equivalence of p-values was
the critical matching criterion between the two MIRT models.
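The matching criterion can be checked by simulation. The sketch below, in Python, is an assumption-laden illustration rather than the estimation procedure used in the study: it does not minimize anything, but only evaluates the sum of squared differences and the two expected p-values for one item on a simulated sample of 2000 abilities. The parameter values are those legible for item 11 in Table 1 and in Table 2 (rho = .00); the function names and the random seed are hypothetical.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def draw_abilities(n, rho, rng):
    """Bivariate normal abilities: zero means, unit variances, correlation rho."""
    thetas = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        thetas.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return thetas


def p_comp(theta, a, d):
    """Compensatory model with c = 0."""
    return logistic(a[0] * theta[0] + a[1] * theta[1] + d)


def p_noncomp(theta, a, b):
    """Noncompensatory model with c = 0."""
    return logistic(a[0] * (theta[0] - b[0])) * logistic(a[1] * (theta[1] - b[1]))


def matching_summary(thetas, comp_item, noncomp_item):
    """Return the least squares criterion and the mean probability under each model."""
    a_c, d = comp_item
    a_n, b = noncomp_item
    sse = total_c = total_n = 0.0
    for theta in thetas:
        pc = p_comp(theta, a_c, d)
        pn = p_noncomp(theta, a_n, b)
        sse += (pc - pn) ** 2
        total_c += pc
        total_n += pn
    n = len(thetas)
    return sse, total_c / n, total_n / n


rng = random.Random(1990)  # arbitrary seed, for reproducibility only
thetas = draw_abilities(2000, 0.0, rng)

comp_item_11 = ((0.91, 0.91), -0.21)               # a1, a2, d from Table 1
noncomp_item_11 = ((1.24, 1.25), (-0.78, -0.75))   # a1, a2, b1, b2 from Table 2

sse, p_value_c, p_value_n = matching_summary(thetas, comp_item_11, noncomp_item_11)
print(f"sum of squared differences: {sse:.2f}")
print(f"mean P, compensatory:    {p_value_c:.3f}")
print(f"mean P, noncompensatory: {p_value_n:.3f}")
```

Both means should fall near the expected p-value of .46 reported for this item, up to sampling error.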

Insert Tables 2-5 
Here 






Model Differences at the Total Test Level 



By treating the two sets of item parameters as known for each of the two
MIRT models, we first investigated the differences between expected number-correct
score frequencies of a 20-item test when θ was distributed as a bivariate
normal random vector with the distributions given previously. These frequencies were
estimated by evaluating either the number-correct distribution under the
compensatory model, $h_C(y)$, or the noncompensatory model, $h_{NC}(y)$, for
y = 0, 1, 2, ..., 20, or

$$h_C(y) = \int\!\!\int f_C(y \mid \theta)\, g(\theta)\, d\theta_1\, d\theta_2 \qquad (4)$$

and

$$h_{NC}(y) = \int\!\!\int f_{NC}(y \mid \theta)\, g(\theta)\, d\theta_1\, d\theta_2 . \qquad (5)$$

In each case, the conditional frequencies, $f_C(y \mid \theta)$ and $f_{NC}(y \mid \theta)$, were
computed using either model (1) or (2) and a recursive procedure described by
Lord and Wingersky (1984). Table 6 gives the signed differences between the
frequencies, $h_C(y) - h_{NC}(y)$, for y = 0, 1, 2, ..., 20, for rho values of .00, .25, .50,
and .75. The greatest differences, as expected, occurred for the highest number-correct
scores, but the differences in frequencies were small, never greater than
.015. For most number-correct score values, these differences became smaller as
rho increased.
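The conditional number-correct distribution at a fixed ability point can be built up one item at a time. The following Python sketch shows one common way to write such a recursion, in the spirit of the procedure attributed to Lord and Wingersky (1984); it assumes locally independent items and takes the item probabilities at a given θ as input. The function name is hypothetical.

```python
def number_correct_distribution(probs):
    """Distribution of the number-correct score for independent items.

    probs[j] is the probability of a correct response to item j at a fixed
    ability point. Returns a list whose element y is Prob(score = y).
    """
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in probs:
        updated = [0.0] * (len(dist) + 1)
        for y, mass in enumerate(dist):
            updated[y] += mass * (1.0 - p)   # item answered incorrectly
            updated[y + 1] += mass * p       # item answered correctly
        dist = updated
    return dist


# Two items, each answered correctly with probability .5.
example = number_correct_distribution([0.5, 0.5])
print(example)  # [0.25, 0.5, 0.25]
```

The marginal frequencies in equations (4) and (5) would then follow by averaging such conditional distributions over the ability distribution, for example by numerical quadrature or Monte Carlo sampling.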
Another way to assess the significance of these differences was to
determine how much data would need to be observed before the differences were
statistically detectable. This was done by calculating the minimum sample size
required to reject the homogeneity of parallel populations with given levels of test
significance and power. These calculations assumed a multivariate normal
approximation for each model's multinomial distribution of observed-score
frequencies, which in turn produced the quadratic form of the noncentrality
parameter of a noncentral chi-square distribution. The minimum sample size
followed as a direct function of this parameter, the specified test significance, and
power. For example, with a significance level of .01 and power equal to .95, the
minimum sample sizes were 1678, 3242, 7466, and 15311 for correlated ability
distributions with rho equal to .00, .25, .50, and .75, respectively. These sample
sizes indicate that even in the unlikely event of uncorrelated ability distributions, it
would still require at least 1678 observed scores from both the compensatory and
noncompensatory MIRT models before the null hypothesis of model equivalence
could be rejected with a power of .95.

Insert Table 6 Here 



The first four (central) moments of each number-correct distribution are
given in Table 7 for each value of rho. Both distributions were negatively skewed,
with the compensatory distribution slightly more platykurtic, and both were
generally flatter than the normal distribution. The variances of the number-correct
scores increased with an increase in rho, and in general, the distributions
of number-correct scores became increasingly similar as rho increased.

Insert Table 7 Here 



A contour plot of the (signed) difference between the number-correct true
scores under the two models, or

$$\sum_{j=1}^{20} P_{C,j}(\theta) - \sum_{j=1}^{20} P_{NC,j}(\theta),$$

was another way to observe model differences at the total test level for various
$(\theta_1, \theta_2)$ points in the ability space. The greatest differences occurred when either
$\theta_1$ or $\theta_2$ was low. See Figures 1-4 for rho values of .00, .25, .50, and .75,
respectively. It should be noted that, in these plots, the only influence of rho was
through the values of the noncompensatory item parameters. Recall that the
compensatory item parameters were fixed for all values of rho. Therefore, when
interpreting these contour plots, one has to mentally superimpose the appropriate
bivariate normal distribution over the contours in order to evaluate the
importance of the true-score differences observed.



Insert Figures 1-4 
Here 



Another way to compare the two MIRT models was to observe the amount
of multidimensional information (MINF) for different points in the ability space
between the two models. MINF has been defined (Reckase, 1986) as a direct
generalization of the unidimensional IRT concept of item information (i.e., the
ratio of the square of the slope of the item characteristic curve at an ability point,
θ, to the variance of the error of the item score at that level of θ). For the
definition of MINF, the slope of the item characteristic surface must be evaluated
in a particular direction, α, a vector of angles with the coordinate axes of the
ability space.
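The verbal definition above can be turned into a small numerical sketch. The Python code below assumes two dimensions, c = 0, and a direction given by a single angle measured from the first ability axis; it approximates the directional slope of the item response surface by a central difference and divides its square by P(1 - P). The function names are hypothetical, and the item parameters are those legible for item 11 in Table 1.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def directional_information(p_func, theta, angle_deg, h=1e-5):
    """Squared slope of p_func at theta along a direction, divided by P(1 - P).

    angle_deg is the angle with the first ability axis (two dimensions only).
    """
    angle = math.radians(angle_deg)
    direction = (math.cos(angle), math.sin(angle))
    forward = p_func((theta[0] + h * direction[0], theta[1] + h * direction[1]))
    backward = p_func((theta[0] - h * direction[0], theta[1] - h * direction[1]))
    slope = (forward - backward) / (2.0 * h)   # central-difference slope
    p = p_func(theta)
    return slope ** 2 / (p * (1.0 - p))


# Compensatory item 11 from Table 1 (a1 = a2 = 0.91, d = -0.21), c = 0.
a1, a2, d = 0.91, 0.91, -0.21


def item_11(theta):
    return logistic(a1 * theta[0] + a2 * theta[1] + d)


info_numeric = directional_information(item_11, (0.0, 0.0), 45.0)

# For the linear logistic model the slope has a closed form, which gives
# P(1 - P) * (a1 * cos(angle) + a2 * sin(angle)) ** 2 as a check.
p0 = item_11((0.0, 0.0))
angle = math.radians(45.0)
info_closed_form = p0 * (1.0 - p0) * (a1 * math.cos(angle) + a2 * math.sin(angle)) ** 2

print(f"numerical:   {info_numeric:.4f}")
print(f"closed form: {info_closed_form:.4f}")
```

Because the numerical version only needs a function that returns a probability, the same routine can be applied to a multiplicative item, and test information follows by summing over items.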

Plots of the absolute difference between the compensatory and
noncompensatory test information vectors (i.e., the sum of item information across
the 20 items) for item parameters estimated with rho values of .00, .25, .50, and
.75 (Figures 5-8, respectively) showed that model differences might be significant
if abilities were negatively correlated. However, for all "likely" ability
distributions, there were no meaningful differences in MINF between the two
models, and these absolute differences appeared to decrease as rho increased.

Insert Figures 5-8 Here 



Model Differences at the Item Level 

It was also of interest to evaluate the differences between models at the
single item response level. There were two ways in which this was done. The
first involved the evaluation of the ideal observer index (Davey, Levine, &
Williams, 1989; Levine, Drasgow, Williams, McCusker, & Thomasson, 1990). A
more complete definition of this index is provided in the appendix of this paper.
However, a simplified definition is as follows. The ideal observer index (IOI) is a
measure of the proportional number of times that a correct decision is made
concerning which of the two competing models produced a particular response to
an item. The decision is one that is made hypothetically by an "ideal observer," or
an individual who has access to all of the information necessary to yield the
highest possible percent of model classification (i.e., compensatory vs.
noncompensatory). As far as the ideal observer is concerned, if the item response
data fail to distinguish between the two competing models, then the value of this
index would be at or near the chance level of .5. Conversely, readily
distinguishable models should yield an index near 1.0.
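The simplified definition can be mimicked with a Monte Carlo experiment. The Python sketch below is an illustration under stated assumptions, not the analytical computation used in the study: on each trial one ability-response pair is generated by the compensatory model and one by the noncompensatory model, and a likelihood-based rule decides which pair came from which model. Ties are counted as half correct, which is an assumption of this sketch. The item parameters are those legible for item 11 in Tables 1 and 2 (rho = .00); the function names, the number of trials, and the seed are hypothetical.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def draw_theta(rho, rng):
    """One draw from the bivariate normal ability distribution."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return (z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2)


def p_comp(theta, a=(0.91, 0.91), d=-0.21):
    return logistic(a[0] * theta[0] + a[1] * theta[1] + d)


def p_noncomp(theta, a=(1.24, 1.25), b=(-0.78, -0.75)):
    return logistic(a[0] * (theta[0] - b[0])) * logistic(a[1] * (theta[1] - b[1]))


def likelihood(p, u):
    """Likelihood of a single response u (1 or 0) given probability p."""
    return p if u == 1 else 1.0 - p


def simulate_ioi(n_trials, rho, seed=0):
    rng = random.Random(seed)
    correct = 0.0
    for _ in range(n_trials):
        t1, t2 = draw_theta(rho, rng), draw_theta(rho, rng)
        # Pair 1 is truly compensatory; pair 2 is truly noncompensatory.
        u1 = 1 if rng.random() < p_comp(t1) else 0
        u2 = 1 if rng.random() < p_noncomp(t2) else 0
        right = likelihood(p_comp(t1), u1) * likelihood(p_noncomp(t2), u2)
        wrong = likelihood(p_comp(t2), u2) * likelihood(p_noncomp(t1), u1)
        if right > wrong:
            correct += 1.0
        elif right == wrong:
            correct += 0.5   # tie: the observer guesses
    return correct / n_trials


ioi_estimate = simulate_ioi(40000, rho=0.0)
print(f"estimated ideal observer index: {ioi_estimate:.3f}")
```

An estimate only slightly above .5 would agree with the pattern described in the text, namely that the matched models are hard to tell apart from single item responses.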

Table 8 shows that the IOI was greater than chance, implying that there
was a difference between the models for all 20 items. However, the IOI was
never greater than .60 and was greater than .55 for only three items, numbers 3, 6,
and 7, when rho was .00. The value of the IOI decreased for each item as rho
increased, implying that it became more difficult to distinguish between the
models as the correlation coefficient increased.

One way to think of the magnitude of the IOI was to imagine how many
trials of the IO experiment would be necessary before the ideal observer could
ascertain, with some given level of certainty, that the models were actually
distinguishable. This would be comparable to a test of the difference between any
obtained IOI from Table 8 and the null proportion of correct model classifications
due to chance. For example, to be able to detect a true difference between the
models for item number 6 with a zero value of rho would require at least 40 trials
of the IO experiment. This would be comparable to a test of the null proportion
of correct classifications due to chance, or .50, versus the (true) alternative
proportion (.555) with a significance of .01 and power of .95. Conversely, a true
IOI of .52 would require more than 290 trials at similar levels of test significance
and power.



Insert Table 8 Here



Another way to evaluate model differences at the item level was to use a
generalized MIRT model, or a reparameterization of both the compensatory and
noncompensatory models into a single MIRT model, or

$$P_j(\theta_i) = c_j + (1 - c_j)\,\frac{e^{(f_{ij1} + f_{ij2})}}{1 + e^{(f_{ij1} + f_{ij2})} + \mu\left[e^{f_{ij1}} + e^{f_{ij2}}\right]} \qquad (6)$$

where μ represented an indicator variable such that

μ = 0, for the linear or compensatory MIRT model,

μ = 1, for the multiplicative or noncompensatory MIRT model.
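How a single indicator can switch between the two models may be easier to see in code. The Python sketch below assumes that the generalized model has the two-dimensional form exp(f1 + f2) / (1 + exp(f1 + f2) + μ[exp(f1) + exp(f2)]) with f_m = a_m(θ_m - b_m) and c = 0; this form is an assumption of the sketch. With μ = 1 the denominator factors into (1 + exp(f1))(1 + exp(f2)), giving the multiplicative model, and with μ = 0 it reduces to the linear model with d = -a1 b1 - a2 b2. The function names and the evaluation point are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_generalized(theta, a, b, mu, c=0.0):
    """Assumed generalized two-dimensional model; mu = 0 or 1 selects the type."""
    f1 = a[0] * (theta[0] - b[0])
    f2 = a[1] * (theta[1] - b[1])
    numerator = math.exp(f1 + f2)
    denominator = 1.0 + numerator + mu * (math.exp(f1) + math.exp(f2))
    return c + (1.0 - c) * numerator / denominator


a = (1.24, 1.25)       # illustrative discriminations
b = (-0.78, -0.75)     # illustrative difficulties
theta = (0.3, -1.2)    # arbitrary ability point

# mu = 1 should reproduce the product of two logistic terms.
multiplicative = (
    logistic(a[0] * (theta[0] - b[0])) * logistic(a[1] * (theta[1] - b[1]))
)

# mu = 0 should reproduce the linear model with d = -a1*b1 - a2*b2.
d = -a[0] * b[0] - a[1] * b[1]
linear = logistic(a[0] * theta[0] + a[1] * theta[1] + d)

print(p_generalized(theta, a, b, mu=1), multiplicative)
print(p_generalized(theta, a, b, mu=0), linear)
```

Treating μ as a free parameter, as in the estimation described next, lets the data indicate which of the two forms is the better description.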
Item response data, $x_{ij}$, were generated from samples of size N = 2000 of
$\theta_i$ drawn from the bivariate normal distributions mentioned previously. The
response data were known to have been produced by either the compensatory or
noncompensatory MIRT model and were simulated by comparing the known
values of $P_j(\theta_i)$ to a pseudorandomly drawn uniform deviate, ω, such that

$$x_{ij} = \begin{cases} 1, & 0 \le \omega \le P_j(\theta_i) \\ 0, & P_j(\theta_i) < \omega \le 1 . \end{cases}$$
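The response-generation rule amounts to one comparison per response. A minimal Python sketch, assuming that the model probabilities have already been computed for each examinee, is given below; the function name, the constant probability of .7, and the seed are hypothetical and serve only to show that the simulated proportion correct tracks the model probability.

```python
import random


def simulate_responses(probabilities, rng):
    """Score 1 when the uniform deviate does not exceed the model probability."""
    return [1 if rng.random() <= p else 0 for p in probabilities]


rng = random.Random(7)            # arbitrary seed
probabilities = [0.7] * 20000     # same probability for every simulated examinee
responses = simulate_responses(probabilities, rng)
observed = sum(responses) / len(responses)
print(f"proportion correct: {observed:.3f}")
```

In a full simulation the list of probabilities would come from model (1) or model (2) evaluated at each sampled ability vector.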



The least squares estimation procedure was used to estimate the
generalized MIRT model parameters. Each estimation was replicated 10 times
with randomly selected starting values. Either four or five unique item
parameters were estimated from the generalized MIRT model, as given by
equation (6). The same item parameters that were given in Tables 1-5 were used
to generate the response data for the estimation procedure. When the response
data were generated by the compensatory model, $a_1$, $a_2$, and d (i.e., $d = -a_1 b_1 - a_2 b_2$),
as well as μ, were estimated. When the response data were generated by
the noncompensatory model, $a_1$, $a_2$, $b_1$, $b_2$, and μ were estimated.

Table 9 shows the average bias in the item parameter estimates and the
standard deviations of the estimates (in parentheses). For compensatory data, the
model parameter, μ, was estimated fairly accurately for the uncorrelated situation,
but the amount of bias and the standard deviation of the estimates increased as
rho increased. A similar situation occurred with noncompensatory data.
However, although the amount of estimation error increased as the correlation
between the abilities increased, the model still remained identifiable, in the sense
that for compensatory data, the μ estimates were statistically "close" to zero.
Likewise, for noncompensatory data, the μ estimates were statistically "close" to
one.

Insert Table 9 Here



The IOI analysis and the generalized MIRT model estimation gave similar
results. That is, there were model differences at the item level, but these
differences tended to decrease as the correlation in abilities increased. The
generalized MIRT analysis also suggested, however, that these differences might
still be estimable even when abilities are strongly correlated.



Summary and Conclusions 



These analyses and results seem to indicate that even though it is difficult
to observe model differences at the overall test score level, there still may be
measurable differences between the responses at the item level. Because the
matching criterion between the two models resulted in similar expected p-values,
we anticipated small differences at the total test score response level, or at the
true score level. The differences that were detected at this level were consistent
with the differences implied in the two models. Fewer high number-correct
scores or estimated true scores were observed from the noncompensatory model,
but these and other total test differences decreased as rho increased. As for the
item response level analysis, both the IOI and the generalized MIRT model
estimation showed that it is possible to quantify these differences and to
distinguish between the data generated by carefully matched item response
models of these two types. However, these differences, although real, are very
small and probably not significant from any practical standpoint.

Although it is difficult to generalize beyond the two-dimensional situation 
used in the present study, it would appear to be difficult to distinguish between 
the two models without the benefit of any prior knowledge of item parameters or 
abilities. Even with such prior knowledge, response data generated by the models 
are nearly indistinguishable, especially with correlated abilities, which is likely the 
case in many real testing situations. 



Appendix



Analytical Definition of the Ideal Observer Index

A hypothetical observer is presented with two abilities, $t_1$ and $t_2$, each with
its associated item response, $u_1$ and $u_2$. The observer is informed that one
ability-response pair was generated by one of two competing item response
models, while the other pair was generated under the second model. The task is
to correctly match each ability-response pair with the proper generating model.
To make this decision, the observer is given access to both competing item
response functions, $P_1$ and $P_2$, and the common ability distribution, $f(t)$.

An ideal observer bases this decision on an optimal rule, δ, which is
determined by the ratio of likelihood functions, $L_i(t_j, u_j) = P_i(t_j)^{u_j} Q_i(t_j)^{1-u_j}$, where
$Q_i(t_j) = 1 - P_i(t_j)$, i = 1, 2; j = 1, 2. The decision rule, δ, is then defined as

$$\delta = \begin{cases}
\text{if } L_1(t_1,u_1) \cdot L_2(t_2,u_2) > L_1(t_2,u_2) \cdot L_2(t_1,u_1), \text{ then decide model } \{P_1; f\} \\
\quad \text{produced sample } \{t_1,u_1\} \text{ while model } \{P_2; f\} \text{ produced } \{t_2,u_2\}; \\[4pt]
\text{if } L_1(t_2,u_2) \cdot L_2(t_1,u_1) > L_1(t_1,u_1) \cdot L_2(t_2,u_2), \text{ then decide model } \{P_2; f\} \\
\quad \text{produced sample } \{t_1,u_1\} \text{ while model } \{P_1; f\} \text{ produced } \{t_2,u_2\}.
\end{cases}$$

The probability of this decision rule being correct, given the model, is

$$\begin{aligned}
\text{Prob}[\delta \text{ correct} \mid \text{model}] ={}& \text{Prob}[L_1(t_1,u_1) \cdot L_2(t_2,u_2) > L_1(t_2,u_2) \cdot L_2(t_1,u_1) \mid \{P_1;f\}\&\{P_2;f\}] \\
&+ \text{Prob}[L_1(t_2,u_2) \cdot L_2(t_1,u_1) > L_1(t_1,u_1) \cdot L_2(t_2,u_2) \mid \{P_2;f\}\&\{P_1;f\}].
\end{aligned}$$

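The response pair, u, where $u = (u_1, u_2)$, can be defined in four possible
patterns: (1,1), (1,0), (0,1), and (0,0). Therefore,

$$\begin{aligned}
&\text{Prob}[L_1(t_1,u_1) \cdot L_2(t_2,u_2) > L_1(t_2,u_2) \cdot L_2(t_1,u_1) \mid \{P_1;f\}\&\{P_2;f\}] = \\
&\quad \text{Prob}[P_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot P_2(t_1) \mid u = (1,1)] \cdot \text{Prob}[u = (1,1) \mid \{P_1;f\}\&\{P_2;f\}] \\
&\quad + \text{Prob}[P_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot P_2(t_1) \mid u = (1,0)] \cdot \text{Prob}[u = (1,0) \mid \{P_1;f\}\&\{P_2;f\}] \\
&\quad + \text{Prob}[Q_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot Q_2(t_1) \mid u = (0,1)] \cdot \text{Prob}[u = (0,1) \mid \{P_1;f\}\&\{P_2;f\}] \\
&\quad + \text{Prob}[Q_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot Q_2(t_1) \mid u = (0,0)] \cdot \text{Prob}[u = (0,0) \mid \{P_1;f\}\&\{P_2;f\}].
\end{aligned}$$

Define $\pi_{u_1 u_2} = \int\!\!\int P_1(t)^{u_1} Q_1(t)^{1-u_1} P_2(g)^{u_2} Q_2(g)^{1-u_2} f(t)\, f(g)\, dt\, dg$.

Then,

$$\begin{aligned}
&\text{Prob}[L_1(t_1,u_1) \cdot L_2(t_2,u_2) > L_1(t_2,u_2) \cdot L_2(t_1,u_1) \mid \{P_1;f\}\&\{P_2;f\}] = \\
&\quad \pi_{11}\, \text{Prob}[P_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot P_2(t_1) \mid u = (1,1)] \\
&\quad + \pi_{10}\, \text{Prob}[P_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot P_2(t_1) \mid u = (1,0)] \\
&\quad + \pi_{01}\, \text{Prob}[Q_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot Q_2(t_1) \mid u = (0,1)] \\
&\quad + \pi_{00}\, \text{Prob}[Q_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot Q_2(t_1) \mid u = (0,0)].
\end{aligned}$$

Similarly,

$$\begin{aligned}
&\text{Prob}[L_1(t_2,u_2) \cdot L_2(t_1,u_1) > L_1(t_1,u_1) \cdot L_2(t_2,u_2) \mid \{P_2;f\}\&\{P_1;f\}] = \\
&\quad \text{Prob}[P_1(t_2) \cdot P_2(t_1) > P_1(t_1) \cdot P_2(t_2) \mid u = (1,1)] \cdot \text{Prob}[u = (1,1) \mid \{P_2;f\}\&\{P_1;f\}] \\
&\quad + \text{Prob}[P_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot P_2(t_2) \mid u = (1,0)] \cdot \text{Prob}[u = (1,0) \mid \{P_2;f\}\&\{P_1;f\}] \\
&\quad + \text{Prob}[Q_1(t_2) \cdot P_2(t_1) > P_1(t_1) \cdot Q_2(t_2) \mid u = (0,1)] \cdot \text{Prob}[u = (0,1) \mid \{P_2;f\}\&\{P_1;f\}] \\
&\quad + \text{Prob}[Q_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot Q_2(t_2) \mid u = (0,0)] \cdot \text{Prob}[u = (0,0) \mid \{P_2;f\}\&\{P_1;f\}].
\end{aligned}$$

Then,

$$\begin{aligned}
&\text{Prob}[L_1(t_2,u_2) \cdot L_2(t_1,u_1) > L_1(t_1,u_1) \cdot L_2(t_2,u_2) \mid \{P_2;f\}\&\{P_1;f\}] = \\
&\quad \pi_{11}\, \text{Prob}[P_1(t_2) \cdot P_2(t_1) > P_1(t_1) \cdot P_2(t_2) \mid u = (1,1)] \\
&\quad + \pi_{10}\, \text{Prob}[P_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot P_2(t_2) \mid u = (1,0)] \\
&\quad + \pi_{01}\, \text{Prob}[Q_1(t_2) \cdot P_2(t_1) > P_1(t_1) \cdot Q_2(t_2) \mid u = (0,1)] \\
&\quad + \pi_{00}\, \text{Prob}[Q_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot Q_2(t_2) \mid u = (0,0)].
\end{aligned}$$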

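Let $\Omega_{u_1 u_2}$ be defined as that region of the ability space where

$$P_1(t_1)^{u_1} Q_1(t_1)^{1-u_1} P_2(t_2)^{u_2} Q_2(t_2)^{1-u_2} > P_1(t_2)^{u_2} Q_1(t_2)^{1-u_2} P_2(t_1)^{u_1} Q_2(t_1)^{1-u_1}$$

holds, and likewise let $\Omega'_{u_1 u_2}$ be defined as that region of the ability space where

$$P_1(t_2)^{u_2} Q_1(t_2)^{1-u_2} P_2(t_1)^{u_1} Q_2(t_1)^{1-u_1} > P_1(t_1)^{u_1} Q_1(t_1)^{1-u_1} P_2(t_2)^{u_2} Q_2(t_2)^{1-u_2}$$

is true. Then

$$\begin{aligned}
\text{Prob}[P_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot P_2(t_1) \mid u = (1,1)] &= \int\!\!\int_{\Omega_{11}} f(t)\, f(g)\, dt\, dg, \\
\text{Prob}[P_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot P_2(t_1) \mid u = (1,0)] &= \int\!\!\int_{\Omega_{10}} f(t)\, f(g)\, dt\, dg, \\
\text{Prob}[Q_1(t_1) \cdot P_2(t_2) > P_1(t_2) \cdot Q_2(t_1) \mid u = (0,1)] &= \int\!\!\int_{\Omega_{01}} f(t)\, f(g)\, dt\, dg,
\end{aligned}$$

and

$$\text{Prob}[Q_1(t_1) \cdot Q_2(t_2) > Q_1(t_2) \cdot Q_2(t_1) \mid u = (0,0)] = \int\!\!\int_{\Omega_{00}} f(t)\, f(g)\, dt\, dg.$$

Then

$$\begin{aligned}
\text{Prob}[P_1(t_2) \cdot P_2(t_1) > P_1(t_1) \cdot P_2(t_2) \mid u = (1,1)] &= \int\!\!\int_{\Omega'_{11}} f(t)\, f(g)\, dt\, dg, \\
\text{Prob}[P_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot P_2(t_2) \mid u = (0,1)] &= \int\!\!\int_{\Omega'_{01}} f(t)\, f(g)\, dt\, dg, \\
\text{Prob}[Q_1(t_2) \cdot P_2(t_1) > Q_2(t_2) \cdot P_1(t_1) \mid u = (1,0)] &= \int\!\!\int_{\Omega'_{10}} f(t)\, f(g)\, dt\, dg,
\end{aligned}$$

and

$$\text{Prob}[Q_1(t_2) \cdot Q_2(t_1) > Q_1(t_1) \cdot Q_2(t_2) \mid u = (0,0)] = \int\!\!\int_{\Omega'_{00}} f(t)\, f(g)\, dt\, dg.$$

Thus, Prob[δ correct | model] =

$$\begin{aligned}
&\pi_{11} \int\!\!\int_{\Omega_{11}} f(t) f(g)\, dt\, dg + \pi_{10} \int\!\!\int_{\Omega_{10}} f(t) f(g)\, dt\, dg + \pi_{01} \int\!\!\int_{\Omega_{01}} f(t) f(g)\, dt\, dg + \pi_{00} \int\!\!\int_{\Omega_{00}} f(t) f(g)\, dt\, dg \\
&+ \pi_{11} \int\!\!\int_{\Omega'_{11}} f(t) f(g)\, dt\, dg + \pi_{10} \int\!\!\int_{\Omega'_{01}} f(t) f(g)\, dt\, dg + \pi_{01} \int\!\!\int_{\Omega'_{10}} f(t) f(g)\, dt\, dg + \pi_{00} \int\!\!\int_{\Omega'_{00}} f(t) f(g)\, dt\, dg
\end{aligned}$$

or

$$\pi_{11} + \pi_{00} + \pi_{10} \left\{ \int\!\!\int_{\Omega_{10}} f(t) f(g)\, dt\, dg + \int\!\!\int_{\Omega'_{01}} f(t) f(g)\, dt\, dg \right\} + \pi_{01} \left\{ \int\!\!\int_{\Omega_{01}} f(t) f(g)\, dt\, dg + \int\!\!\int_{\Omega'_{10}} f(t) f(g)\, dt\, dg \right\}.$$

Finally, Prob[δ correct] = Prob[δ correct | model] · Prob[selecting a
model]. Because each model is equally likely, the probability of selecting a model
is equal to ½. Thus, Prob[δ correct] = .5(Prob[δ correct | model]).
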
Table 1

Original Item Parameters for the Compensatory Model

                                                   E(p-value), by rho
Item #   a1            a2            d             .00           .25           .50           .75

01       [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
02       [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
03       [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
04       0.99          1.00          [illegible]   .42           [illegible]   [illegible]   [illegible]
05       0.58          1.65          0.78          [illegible]   [illegible]   [illegible]   [illegible]
06       0.91          1.27          0.42          .57           .57           .57           [illegible]
07       1.03          0.95          1.08          .69           [illegible]   .67           [illegible]
08       [illegible]   [illegible]   [illegible]   .55           .55           [illegible]   [illegible]
09       0.61          0.72          1.63          .80           .79           .79           .78
10       0.67          1.12          0.60          .61           .61           .60           .60
11       0.91          0.91          -0.21         .46           .46           .46           .47
12       0.64          1.72          -0.05         .49           .49           .49           .49
13       1.65          [illegible]   0.40          .57           .56           .56           .56
14       0.18          1.61          1.84          .78           .78           .78           .77
15       0.82          1.02          0.09          .52           .52           .52           .51
16       1.45          0.81          -0.24         .46           .46           .46           .46
17       1.64          0.62          0.85          .64           .63           .63           .62
18       0.77          0.76          -0.91         .32           .33           .34           .34
19       1.46          0.62          0.10          .52           .52           .52           .52
20       [illegible]   1.37          0.32          .56           .56           .55           .55

Table 2

Item Parameters for the Noncompensatory Model with Rho = .00

Item #   a1            a2            b1            b2            E(p-value)

01       [illegible]   1.60          -0.92         -0.15         .38
02       [illegible]   1.04          [illegible]   -2.28         .34
03       1.22          1.39          -1.42         -0.99         .59
04       [illegible]   1.35          -0.62         [illegible]   .42
05       1.02          1.82          -2.71         -0.62         .62
06       1.25          1.53          -1.45         -0.79         [illegible]
07       [illegible]   1.26          -1.48         -1.63         [illegible]
08       0.92          2.38          -3.95         -0.22         .55
09       0.93          1.00          -2.75         -2.35         [illegible]
10       1.05          1.37          -1.96         -0.90         .61
11       1.24          1.25          -0.78         -0.75         .46
12       1.07          1.92          -2.17         -0.19         .49
13       1.81          0.88          -0.36         -3.25         .56
14       0.85          1.67          -5.26         -1.17         .78
15       1.17          1.32          -1.21         -0.75         .51
16       1.71          1.23          [illegible]   -1.35         .45
17       1.83          1.06          [illegible]   -2.55         .63
18       1.09          1.09          -0.31         -0.32         .32
19       1.69          1.07          -0.35         -1.98         .51
20       0.88          [illegible]   -2.98         -0.41         .55

Table 3

Item Parameters for the Noncompensatory Model with Rho = .25

Item #   a1            a2            b1            b2            E(p-value)

01       1.38          1.74          -0.79         -0.14         .39
02       2.40          1.14          [illegible]   -1.88         .34
03       [illegible]   1.50          -1.27         -0.91         .58
04       1.44          1.45          -0.56         -0.51         .42
05       1.17          1.94          [illegible]   -0.60         .61
06       1.40          1.66          -1.28         -0.73         .56
07       1.45          1.40          -1.34         -1.47         .72
08       1.05          2.47          -3.30         -0.22         .55
09       1.02          1.09          -2.49         -2.17         .79
10       1.17          1.47          -1.72         -0.85         .60
11       1.34          1.34          -0.71         -0.68         .46
12       1.21          2.06          -1.82         -0.20         .49
13       1.90          0.98          -0.36         -2.80         .56
14       0.93          1.72          -4.65         -1.15         .78
15       1.29          1.42          -1.08         -0.69         .51
16       1.84          1.33          -0.27         -1.16         .45
17       1.97          1.20          -0.66         -2.19         .62
18       1.15          1.16          -0.28         -0.27         .33
19       1.80          1.18          -0.35         -1.71         .51
20       0.98          1.61          -2.57         -0.40         .55

Table 4

Item Parameters for the Noncompensatory Model with Rho = .50

Item #   a1            a2            b1            b2            E(p-value)

01       [illegible]   1.82          -0.66         -0.12         .39
02       [illegible]   1.27          0.12          [illegible]   .35
03       [illegible]   [illegible]   -1.14         -0.85         [illegible]
04       [illegible]   [illegible]   -0.50         [illegible]   .42
05       [illegible]   2.04          -1.97         -0.59         .61
06       1.55          1.79          -1.13         -0.68         .56
07       [illegible]   1.55          -1.23         -1.33         .67
08       1.20          2.51          -2.78         -0.22         .55
09       1.10          1.17          -2.30         -2.01         .78
10       [illegible]   1.56          -1.53         -0.80         .60
11       1.44          1.43          -0.64         -0.61         .46
12       [illegible]   2.13          -1.54         -0.19         .49
13       [illegible]   1.09          -0.36         -2.39         .56
14       1.03          1.77          -4.07         -1.13         .77
15       [illegible]   1.51          -0.97         -0.64         .51
16       1.95          1.47          -0.26         -0.99         .46
17       2.08          1.35          -0.63         -1.89         .62
18       1.21          1.20          -0.23         -0.23         .33
19       1.89          1.30          -0.34         -1.46         .51
20       1.08          1.66          -2.23         -0.40         .55

Table 5

Item Parameters for the Noncompensatory Model with Rho = .75

Item #   a1            a2            b1            b2            E(p-value)

01       1.65          1.92          [illegible]   -0.10         .40
02       [illegible]   1.43          [illegible]   -1.14         .35
03       1.60          1.73          -1.01         -0.77         .58
04       1.63          1.64          -0.42         -0.39         .43
05       1.48          2.14          -1.67         [illegible]   .61
06       1.69          1.92          -0.98         -0.62         .56
07       1.69          1.66          -1.13         -1.21         .66
08       [illegible]   [illegible]   [illegible]   -0.22         .55
09       1.15          1.21          [illegible]   -1.93         .78
10       1.38          1.63          -1.36         -0.76         .60
11       [illegible]   1.51          [illegible]   -0.54         .46
12       [illegible]   2.23          -1.22         -0.19         .49
13       1.98          1.26          -0.36         -1.99         .56
14       1.15          1.78          -3.60         -1.11         .77
15       1.47          [illegible]   -0.85         -0.59         .51
16       2.03          1.63          -0.23         -0.81         .46
17       2.15          [illegible]   -0.61         -1.60         .62
18       1.24          1.24          -0.18         -0.18         .34
19       1.94          1.44          -0.33         -1.23         .51
20       1.17          1.70          -1.92         -0.40         .55

Table 6

Compensatory Minus Noncompensatory Density Differences in Number-correct Score

Number-correct                        rho
score (y)        .00           .25           .50           .75

20               .013          .014          .014          .011
19               .015          .012          .009          .004
18               .012          .007          .003          .000
17               .007          .003          .000          -.002
16               .002          -.001         -.002         -.003
15               -.003         -.003         -.004         -.003
14               -.006         -.005         -.004         -.003
13               -.009         -.007         -.005         -.003
12               -.011         -.007         -.005         -.002
11               -.012         -.008         -.005         -.002
10               -.012         -.008         -.004         -.001
 9               -.011         -.007         -.004         [illegible]
 8               -.009         -.006         -.003         [illegible]
 7               -.006         -.004         -.002         .001
 6               -.003         -.002         -.001         .001
 5               .001          .000          .001          .002
 4               .005          .002          .002          .002
 3               .008          .005          .003          .002
 2               .009          .006          .004          .001
 1               .008          .006          .003          -.001
 0               -.005         .003          .001          -.004

Table 7

Central Moments of Number-correct Scores

                                      Second        Third         Fourth
                                      Central       Central       Central
MIRT Model         rho      Mean      Moment        Moment        Moment

Compensatory       .00      10.90     25.79         [illegible]   1362.83
                   .25      10.88     29.40         -20.44        1680.36
                   .50      10.86     32.64         -24.01        1980.03
                   .75      10.84     35.57         -27.27        2262.98

Noncompensatory    .00      10.79     20.67         -9.42         946.49
                   .25      10.78     25.43         -15.86        1336.75
                   .50      10.78     30.12         -24.30        [illegible]
                   .75      10.78     34.70         -32.74        [illegible]

Table 8

Ideal Observer Index

                          rho
Item #   .00       .25       .50       .75

01       .5479     .5397     .5295     .5179
02       .5311     .5265     .5205     .5128
03       .5513     .5418     .5307     .5183
04       .5461     .5377     .5279     .5171
05       .5421     .5353     .5265     .5157
06       .5550     .5451     .5332     .5194
07       .5511     .5419     .5304     .5175
08       .5243     .5212     .5165     .5102
09       .5276     .5227     .5162     .5092
10       .5430     .5351     .5254     .5149
11       .5435     .5355     .5260     .5156
12       .5448     .5375     .5281     .5166
13       .5291     .5246     .5185     .5112
14       .5124     .5109     .5082     .5048
15       .5456     .5370     .5271     .5161
16       .5497     .5411     .5307     .5182
17       .5442     .5371     .5276     .5162
18       .5281     .5232     .5175     .5114
19       .5425     .5352     .5260     .5156
20       .5292     .5241     .5179     .5108

Table 9

Average Bias (parameter estimate - true parameter) and Standard Deviation
of Bias in Estimates of the Generalized MIRT Model Parameters

Response
Data Model         rho     a1         a2         d or b1    b2         μ

Compensatory       .00     .044       .024       .069                  .013
                           (.042)     (.073)     (.158)                (.009)
                   .25     .044       .040       .125                  .026
                           (.047)     (.042)     (.275)                (.052)
                   .50     .078       .069       .255                  .064
                           (.055)     (.081)     (.238)                (.060)
                   .75     .098       .113       .787                  .107
                           (.128)     (.080)     (1.930)               (.094)

Noncompensatory    .00     -.008      .009       .130       .230       -.199
                           (.099)     (.115)     (.448)     (.354)     (.163)
                   .25     -.006      -.004      .250       .254       -.197
                           (.090)     (.083)     (.622)     (.464)     (.144)
                   .50     .039       -.076      .191       .183       -.200
                           (.145)     (.104)     (.888)     (.265)     (.125)
                   .75     -.155      -.059      .071       .250       -.288
                           (.220)     (.105)     (.439)     (.421)     (.175)

Note: Standard deviations are in parentheses. The third estimate column gives d for
compensatory data and b1 for noncompensatory data.

Figure Captions

Figure 1. Difference Between Compensatory and Noncompensatory True Scores:
Rho = .00

Figure 2. Difference Between Compensatory and Noncompensatory True Scores:
Rho = .25

Figure 3. Difference Between Compensatory and Noncompensatory True Scores:
Rho = .50

Figure 4. Difference Between Compensatory and Noncompensatory True Scores:
Rho = .75

Figure 5. Absolute Difference Between Compensatory and Noncompensatory Test
Information Vectors: Rho = .00

Figure 6. Absolute Difference Between Compensatory and Noncompensatory Test
Information Vectors: Rho = .25

Figure 7. Absolute Difference Between Compensatory and Noncompensatory Test
Information Vectors: Rho = .50

Figure 8. Absolute Difference Between Compensatory and Noncompensatory Test
Information Vectors: Rho = .75